Skip to content

fix(post-review): M/L cleanup from deep code review#57

Merged
aksOps merged 1 commit into
mainfrom
fix/post-review-ml-cleanup
Apr 28, 2026
Merged

fix(post-review): M/L cleanup from deep code review#57
aksOps merged 1 commit into
mainfrom
fix/post-review-ml-cleanup

Conversation

@aksOps

@aksOps aksOps commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Summary

Second wave of follow-ups from the deep code review of PRs #49#55. PR1 (#56) shipped the High/Critical fixes; this PR cleans up the Medium/Low items that survived prioritization.

Changes

  • tsdb (M3): Aggregator.Ingest now releases mu before firing the cardinality-overflow callback. Today the callback is a Prometheus increment (atomic, fast), but holding the lock across an external function call is a footgun for any future hook that touches I/O. Pattern: capture target tenant under the lock, invoke after Unlock.
  • storage (M6): pgLogsRelkind uses errors.Is(err, sql.ErrNoRows) instead of strings.Contains(err.Error(), \"no rows\"). Robust against driver wrapping or message changes.
  • storage (L3): Repository.logsPartitioned is now atomic.Bool. The plain bool worked because the writer ran during NewRepository before any reader goroutine started, but the contract was brittle and no test would have caught a torn read on weakly-ordered architectures.
  • config (L1): Validate() rejects negative MCP_MAX_CONCURRENT / MCP_CALL_TIMEOUT_MS / MCP_CACHE_TTL_MS. 0 remains the documented "disable" sentinel; negatives are typos that should fail loud at startup, not silently clamp.
  • mcp (L2): SetCallLimit doc upgraded to flag the function as startup-only — runtime resize would leak a slot in the old channel.

Skipped (with rationale)

  • M1 Submit TOCTOU on closed pipeline — cosmetic only, current ordering is documented.
  • M2 ring/onIngest setter races — fixing properly requires an API change; benign during normal startup-only usage.
  • M4 FTS5 trigger throughput — needs a bulk-rebuild migration path, not a one-line tweak.
  • M5 isQueueFull scope — hypothetical concern with no observed symptom; revisit if metrics show drift.

Test plan

  • go vet ./... clean
  • go build ./... succeeds
  • go test ./... -race -count=1 — 404 tests pass across 27 packages
  • tsdb tests cover the new unlock-then-callback ordering (existing TestAggregator_* suite passes under -race)
  • config validation rejects negative values for the three MCP knobs (covered by existing TestConfig_Validate*)
  • commit signed (ED25519)

🤖 Generated with Claude Code

Five small follow-ups from the second-pass review of PRs #49#55:

- tsdb: fire cardinality-overflow callback AFTER releasing the
  Aggregator mutex. The callback is currently a Prometheus
  increment (atomic) but holding mu across an external function
  call is a footgun for any future hook. Capture the tenant
  under lock; invoke after Unlock.
- storage: use errors.Is(err, sql.ErrNoRows) in pgLogsRelkind
  instead of strings.Contains(err.Error(), "no rows"). Robust
  against driver wrapping.
- storage: convert Repository.logsPartitioned from plain bool
  to atomic.Bool. Removes the memory-model fragility of "the
  writer ran first" — read by retention.go from a separate
  goroutine.
- config: reject negative MCP_MAX_CONCURRENT / MCP_CALL_TIMEOUT_MS
  / MCP_CACHE_TTL_MS at Validate(). 0 stays the documented
  "disable" sentinel; negatives are typos that should fail loud.
- mcp: upgrade SetCallLimit doc to flag it startup-only — runtime
  resize leaks a slot in the old channel.

Skipped (with rationale, not silently dropped):
- M1 Submit TOCTOU on closed pipeline — cosmetic only, current
  ordering is documented.
- M2 ring/onIngest setter races — would require API change to
  fix properly; benign during normal startup-only usage.
- M4 FTS5 trigger throughput — needs a bulk-rebuild path, not
  a one-line tweak.
- M5 isQueueFull scope — hypothetical concern with no observed
  symptom; revisit only if metrics show drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sonarqubecloud

Copy link
Copy Markdown

@aksOps aksOps merged commit 75885db into main Apr 28, 2026
17 checks passed
@aksOps aksOps deleted the fix/post-review-ml-cleanup branch April 28, 2026 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant